vLLM OpenAI API server supports recording experience data by pan-x-c · Pull Request #591 · agentscope-ai/Trinity-RFT

pan-x-c · 2026-06-24T12:12:44Z

Description

As the title says

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

…th legacy)

Refactor experience production so heavy data (tokens/logprobs/routed_experts) no longer rides runner->scheduler->coordinator as serialized bytes. The vLLM recorder now captures it in-process into a MemoryStore keyed by task_id, and the coordinator pulls it at finalize time via /records/consume_task. Runners ship only a small reward map. Both paths coexist behind `explorer.use_recorded_experience` (default off = legacy).

Recording module (trinity/common/models/vllm_patch/recording/):
- store: drop SqlStore; MemoryStore.update_reward_by_task_id stamps reward/run/task on a whole task-id group, pops and returns it (the in-memory replacement for the SQL HistoryRecorder join).
- recorder: track in-flight record tasks; add flush() (await pending + queue.join) so a consume sees a quiesced store; honor skip_recording_ctx.
- models: build_experience emits one Experience per completion (n>1) with info["sample_index"]; eid.suffix=request_id kept for traceability.
- context: add skip_recording_ctx; task_id already flows via api_key (RecordingIdentityMiddleware) and now also via VLLMModel.chat (Ray entry).
- query: POST /records/consume_task (flush -> update_reward_by_task_id -> serialize_many); drop the SqlStore 503 branch.
- config/server: remove RecordingConfig entirely; the logprob width is a recorder-internal constant (we store only the chosen token, which vLLM force-includes at logprobs=1). No static config threaded through launch.

task_id propagation (Ray entry, same contextvar as the HTTP middleware):
- vllm_model: chat/generate accept task_id_key, set task_id_ctx around _generate_internal; logprobs sets skip_recording_ctx (auxiliary forward).
- model: ModelWrapper.chat/chat_async forward task_id_key; SGLang.chat accepts-and-ignores it (recording is vLLM-only).
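The task_id propagation described above relies on a context variable that is set around the generation call, so concurrent requests do not see each other's ids. The following is a minimal sketch of that pattern, not the Trinity-RFT code: `task_id_ctx` and `skip_recording_ctx` only mirror the names in the commit message, while `record_if_enabled`, `generate`, and the `recorded` list are hypothetical stand-ins.

```python
import asyncio
import contextvars
from typing import List, Optional, Tuple

# Hypothetical stand-ins for the context variables named in the commit message.
task_id_ctx = contextvars.ContextVar("task_id", default=None)
skip_recording_ctx = contextvars.ContextVar("skip_recording", default=False)

# (task_id, text) pairs; a toy replacement for the real recorder.
recorded: List[Tuple[str, str]] = []


def record_if_enabled(text: str) -> None:
    """Record the output unless recording is skipped or no task_id is set."""
    task_id = task_id_ctx.get()
    if skip_recording_ctx.get() or task_id is None:
        return
    recorded.append((task_id, text))


async def generate(prompt: str, task_id_key: Optional[str] = None, skip: bool = False) -> str:
    """Set the context variables around a fake engine call, then restore them."""
    id_token = task_id_ctx.set(task_id_key)
    skip_token = skip_recording_ctx.set(skip)
    try:
        await asyncio.sleep(0)  # stands in for the real engine call
        text = prompt.upper()
        record_if_enabled(text)
        return text
    finally:
        skip_recording_ctx.reset(skip_token)
        task_id_ctx.reset(id_token)


async def main() -> None:
    await asyncio.gather(
        generate("a", task_id_key="task-1"),
        generate("b", task_id_key="task-2"),
        generate("c", task_id_key="task-3", skip=True),  # auxiliary call, not recorded
    )


asyncio.run(main())
```

Each task created by `asyncio.gather` runs in its own copy of the context, which is why the three calls can set different ids without interfering.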
Coordinator + runner + workflow:
- rollout_coordinator: _resolve_rank_urls (ray.get_actor per engine) and a recording-mode finalize that fans out /records/consume_task per engine, deserializes, and feeds objects to the pipeline (no re-serialization).
- experience_pipeline: process_experiences(exps) public object entry.
- workflow_runner: recording mode returns a pickled reward map keyed by the per-sample task_id_key the workflow stamped; legacy path unchanged.
- workflow: SimpleWorkflow/AsyncSimpleWorkflow run a per-sample n=1 loop in recording mode (distinct task_id_key per sample == reward unit for GRPO), legacy n=repeat_times single-call path unchanged.
- config: ExplorerConfig.use_recorded_experience flag.

SQL path removal (MemoryStore only):
- delete proxy/recorder.py (HistoryRecorder) and proxy_test.py
- proxy service/app drop /feedback, /commit, record_feedback, submit_experiences, ready_experiences (keep allocate_model + weight sync)
- allocator no longer fills record_db_url
- drop InferenceModelConfig.record_db_url and the dead ExplorerConfig.db_url field
- RecordingConfig deleted

Serve-mode external reward reporting is intentionally left unimplemented this version (proxy /feedback//commit removed); the affected serve integration tests (TestServeWithTrainer, ServeTest) are skipped with a pointer to the recording refactor plan. convert_messages_to_experience redirect (multi-turn) is deferred with TODOs at its call sites.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
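The split between a small reward map from the runner and heavy records held by the engine can be illustrated with a toy in-memory store. This is a sketch under assumptions: `Record`, `MemoryStoreSketch`, and `finalize` are hypothetical, and the real `MemoryStore.update_reward_by_task_id` also stamps run and task fields that are omitted here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    """Hypothetical record; the real Experience carries tokens, logprobs, etc."""
    task_id: str
    tokens: List[int]
    reward: Optional[float] = None


class MemoryStoreSketch:
    """Illustrative in-memory store keyed by task_id (not the actual MemoryStore)."""

    def __init__(self) -> None:
        self._by_task: Dict[str, List[Record]] = defaultdict(list)

    def add(self, record: Record) -> None:
        self._by_task[record.task_id].append(record)

    def update_reward_by_task_id(self, task_id: str, reward: float) -> List[Record]:
        """Stamp the reward on the whole task-id group, then pop and return it."""
        group = self._by_task.pop(task_id, [])
        for record in group:
            record.reward = reward
        return group


def finalize(store: MemoryStoreSketch, reward_map: Dict[str, float]) -> List[Record]:
    """Consume every task listed in the runner's reward map, as a coordinator might."""
    out: List[Record] = []
    for task_id, reward in reward_map.items():
        out.extend(store.update_reward_by_task_id(task_id, reward))
    return out


store = MemoryStoreSketch()
store.add(Record("task-1", [1, 2, 3]))
store.add(Record("task-2", [4, 5]))
experiences = finalize(store, {"task-1": 1.0, "task-2": 0.0})
```

Because the group is popped when it is consumed, a second consume for the same task_id returns an empty list rather than duplicating experiences.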

pan-x-c · 2026-07-01T03:19:25Z

/unittest-module-common

github-actions · 2026-07-01T04:01:45Z

unittest: Run #1798

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Pending ⏳	Other ❓	Flaky 🍂	Duration ⏱️
97	91	5	1	0	0	0	41m 34s

❌ Some tests failed!

Name	Failure Message
❌ tests/common/vllm_test.py::TestAPIServerCommon::test_api	The test failed in the call phase due to an assertion error
❌ tests/common/vllm_test.py::TestQwen35APIServerMultiModal::test_multi_modal_content	The test failed in the call phase
❌ tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	The test failed in the call phase due to an assertion error
❌ tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	The test failed in the call phase due to an assertion error
❌ tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	The test failed in the call phase due to an assertion error

Github Test Reporter by CTRF 💚

pan-x-c · 2026-07-01T04:28:45Z

/unittest-module-common

github-actions · 2026-07-01T05:13:18Z

unittest: Run #1799

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Pending ⏳	Other ❓	Flaky 🍂	Duration ⏱️
97	96	0	1	0	0	0	43m 48s

🎉 All tests passed!


pan-x-c · 2026-07-01T06:46:39Z

/unittest-module-trainer

chenyushuo · 2026-07-01T07:15:14Z

+            completed_runs=self.__class__.can_repeat and self.repeat_times or 1,
+            total_runs=self.__class__.can_repeat and self.repeat_times or 1,


`__class__` may not be needed here; `self.can_repeat` also resolves the class attribute.
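A small sketch of the point raised in this comment, assuming hypothetical classes `Workflow` and `RepeatableWorkflow`; only `can_repeat` and `repeat_times` come from the diff. It compares the `and ... or` form with instance attribute lookup plus a conditional expression.

```python
class Workflow:
    can_repeat = False  # class-level default, as the diff suggests

    def __init__(self, repeat_times: int) -> None:
        self.repeat_times = repeat_times

    def runs_and_or(self) -> int:
        # Form used in the diff: falls back to 1 whenever the left side is falsy.
        return self.__class__.can_repeat and self.repeat_times or 1

    def runs_conditional(self) -> int:
        # Same intent via instance lookup, which falls back to the class attribute.
        return self.repeat_times if self.can_repeat else 1


class RepeatableWorkflow(Workflow):
    can_repeat = True


repeatable = RepeatableWorkflow(repeat_times=4)
plain = Workflow(repeat_times=4)
```

The two forms agree for positive `repeat_times`. They differ only in the edge case `repeat_times == 0` with `can_repeat` true, where the `and ... or` form yields 1 and the conditional expression yields 0.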

chenyushuo · 2026-07-01T07:18:04Z

        self.repeat_times = repeat_times
-        self.task.rollout_args.n = repeat_times
        self.run_id_base = run_id_base
+        self.task.rollout_args.n = repeat_times


Suggested change

-        self.repeat_times = repeat_times
-        self.run_id_base = run_id_base
-        self.task.rollout_args.n = repeat_times
+        super().set_repeat_times(repeat_times, run_id_base)
+        self.task.rollout_args.n = repeat_times
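The suggestion appears to delegate the shared assignments to the parent class and keep only the subclass-specific line. A minimal sketch of that shape follows, assuming the parent's `set_repeat_times` performs the two assignments shown in the diff; `BaseWorkflow`, `SimpleWorkflowSketch`, and the `SimpleNamespace` task object are hypothetical.

```python
from types import SimpleNamespace


class BaseWorkflow:
    """Hypothetical base class that owns the shared bookkeeping."""

    def set_repeat_times(self, repeat_times: int, run_id_base: int) -> None:
        self.repeat_times = repeat_times
        self.run_id_base = run_id_base


class SimpleWorkflowSketch(BaseWorkflow):
    def __init__(self) -> None:
        # Stand-in for the task object; only rollout_args.n matters here.
        self.task = SimpleNamespace(rollout_args=SimpleNamespace(n=1))

    def set_repeat_times(self, repeat_times: int, run_id_base: int) -> None:
        # Reuse the parent's assignments, then apply the subclass-specific one.
        super().set_repeat_times(repeat_times, run_id_base)
        self.task.rollout_args.n = repeat_times


wf = SimpleWorkflowSketch()
wf.set_repeat_times(8, run_id_base=0)
```

With this shape, a later change to the parent's bookkeeping is picked up by the subclass without editing it.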

chenyushuo · 2026-07-01T07:34:24Z

+    # ``enable_router_replay`` (mirrored to ``enable_return_routed_experts`` in
+    # ``config_validator``); it is not implied by ``enable_history``, so dense
+    # models can record history too.
    enable_history: bool = False


This default could be set to True.

chenyushuo · 2026-07-01T07:34:38Z

    enable_history: bool = False

    # For OpenAI API
    enable_openai_api: bool = False


This default could be set to True.

chenyushuo · 2026-07-01T07:35:33Z

+    response_text: str = ""  # Text of the response
    prompt_text: Optional[str] = None  # Text of the prompt


Can response_text and prompt_text share the same default value?
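To make the question concrete, here is a sketch of the two options using plain dataclasses. `MixedDefaults`, `UniformDefaults`, and `describe` are hypothetical names; only the two field declarations in `MixedDefaults` come from the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MixedDefaults:
    """Defaults as in the diff: empty string versus None."""
    response_text: str = ""
    prompt_text: Optional[str] = None


@dataclass
class UniformDefaults:
    """One possible alignment: both fields default to None."""
    response_text: Optional[str] = None
    prompt_text: Optional[str] = None


def describe(text: Optional[str]) -> str:
    """Distinguish a missing value from an empty one."""
    if text is None:
        return "missing"
    return "empty" if text == "" else "present"


mixed = MixedDefaults()
uniform = UniformDefaults()
```

With mixed defaults, callers need a different check per field (`== ""` versus `is None`). A shared default lets one check cover both, at the cost of changing the type of `response_text` to `Optional[str]`.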

pan-x-c · 2026-07-01T13:42:28Z

/unittest-module-trainer

github-actions · 2026-07-01T15:00:39Z

unittest: Run #1803

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Pending ⏳	Other ❓	Flaky 🍂	Duration ⏱️
29	25	0	4	0	0	0	1h 17m

🎉 All tests passed!


openai api server record experience data

776ae71

pan-x-c marked this pull request as draft June 24, 2026 12:26

pan-x-c and others added 28 commits June 25, 2026 11:15

simplify

5fe9053

add config

180b416

use api key as session id

f27988b

clean stale header

40b53e8

simplify code

4ae9392

unify vllm experience recording

4f0f0a7

add tests

8ce1f4b

add log

02eb4e4

fix middleware

db49262

update interface

e6ae7dc

fix prompt text

63a5404

fix streaming recorder

5ff21b2

add delta stream

6a0ad44

refactor history recording

bb2f22c

sglang self recording experiences

bf931de

add sglang tests

a274ed3

fix sglang tests

35952ae

remove enable recording

6d6ce69

Merge branch 'feature/model_self_record_experience' of github.com:pan…

034fb96

…-x-c/Trinity-RFT into feature/model_self_record_experience

fix models

82d75ba

add recording server

5cc3f05

fix sglang

936da53

remove redundant fields

2228d65

fix vllm test

3122b55

fix tests

579d189

add store

052562a

refactor store

d8f3de6

record

52c2296

pan-x-c added 2 commits July 1, 2026 11:29

simplify pipeline

9c57277

fix logprobs

ed4d2d8

pan-x-c added 4 commits July 1, 2026 12:03

fix reasoning parser

8864c36

fix toolcall

0d359a4

update recording context

14b7262

fix pre-commit

892cefc

pan-x-c added 3 commits July 1, 2026 13:31

clean scheduler

da9b0f1

clean coordinator

82b1544

block finished batch

98430b4

pan-x-c marked this pull request as ready for review July 1, 2026 06:46

chenyushuo reviewed Jul 1, 2026


pan-x-c added 6 commits July 1, 2026 19:57

update workflow doc

1146972

clean build experience

0ccf77c

clean build experience

815bd46

fix model version

0571edb

fix model version drift

14d5bdc

fix workflow reset

740805a

pan-x-c force-pushed the feature/model_self_record_experience branch from c85db9e to 740805a on July 1, 2026 13:37

pan-x-c wants to merge 79 commits into agentscope-ai:main from pan-x-c:feature/model_self_record_experience
